              Applicability of the Babel Routing Protocol

Abstract

   Babel is a routing protocol based on the distance-vector algorithm
   augmented with mechanisms for loop avoidance and starvation
   avoidance.  This document describes a number of niches where Babel
   has been found to be useful and that are arguably not adequately
   served by more mature protocols.

1.  Introduction and Background

   Babel [RFC8966] is a routing protocol based on the familiar distance-
   vector algorithm (sometimes known as distributed Bellman-Ford)
   augmented with mechanisms for loop avoidance (there is no "counting
   to infinity") and starvation avoidance.  This document describes a
   number of niches where Babel is useful and that are arguably not
   adequately served by more mature protocols such as OSPF [RFC5340] and
   IS-IS [RFC1195].

1.1.  Technical Overview of the Babel Protocol

   At its core, Babel is a distance-vector protocol based on the
   distributed Bellman-Ford algorithm, similar in principle to RIP
   [RFC2453] but with two important extensions: provisions for sensing
   of neighbour reachability, bidirectional reachability, and link
   quality, and support for multiple address families (e.g., IPv6 and
   IPv4) in a single protocol instance.

   Algorithms of this class are simple to understand and simple to
   implement, but unfortunately they do not work very well -- they
   suffer from "counting to infinity", a case of pathologically slow
   convergence in some topologies after a link failure.  Babel uses a
   mechanism pioneered by the Enhanced Interior Gateway Routing Protocol
   (EIGRP) [DUAL] [RFC7868], known as "feasibility", which avoids
   routing loops and therefore makes counting to infinity impossible.
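
   The feasibility test can be illustrated with a minimal Python
   sketch.  It is a simplification that considers metrics only; the
   condition specified in [RFC8966] also compares sequence numbers.
   The names is_feasible and feasibility_distance are illustrative and
   are not taken from the specification.

```python
INFINITY = 0xFFFF  # metric value that denotes a retraction


def is_feasible(advertised_metric, feasibility_distance):
    """Return True if a route advertisement may be accepted.

    advertised_metric: metric the neighbour advertises for the route.
    feasibility_distance: smallest metric this router has ever
        advertised for the destination, or None if it never has.
    """
    if advertised_metric >= INFINITY:
        return True  # retractions are always acceptable
    if feasibility_distance is None:
        return True  # nothing advertised yet, so no loop can include us
    # A strictly smaller metric means the neighbour is closer to the
    # destination than we ever were, so it cannot be routing through us.
    return advertised_metric < feasibility_distance


fd = 10
print(is_feasible(7, fd))   # accepted
print(is_feasible(10, fd))  # rejected
print(is_feasible(12, fd))  # rejected, even if the route is loop-free
```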

   Feasibility is a conservative mechanism, one that not only avoids all
   looping routes but also rejects some loop-free routes.  Thus, it can
   lead to a situation known as "starvation", where a router rejects all
   routes to a given destination, even those that are loop-free.  In
   order to recover from starvation, Babel uses a mechanism pioneered by
   the Destination-Sequenced Distance-Vector Routing Protocol (DSDV)
   [DSDV] and known as "sequenced routes".  In Babel, this mechanism is
   generalised to deal with prefixes of arbitrary length and routes
   announced at multiple points in a single routing domain (DSDV was a
   pure mesh protocol, and only carried host routes).

   In DSDV, the sequenced routes algorithm is slow to react to a
   starvation episode.  In Babel, starvation recovery is accelerated by
   using explicit requests (known as "seqno requests" in the protocol)
   that signal a starvation episode and cause a new sequenced route to
   be propagated in a timely manner.  In the absence of packet loss,
   this mechanism is provably complete and clears the starvation in time
   proportional to the diameter of the network, at the cost of some
   additional signalling traffic.
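
   The interaction between feasibility and sequenced routes can be
   sketched as follows.  This is an illustration under simplifying
   assumptions (a single destination, distances represented as
   (seqno, metric) pairs, no packet loss); the function names are
   hypothetical, and the normative rules are those of [RFC8966].

```python
def seqno_less(a, b):
    """True if seqno a is older than seqno b, compared modulo 2**16."""
    return 0 < ((b - a) % 65536) < 32768


def is_feasible(update, fd):
    """update and fd are (seqno, metric) pairs; fd is None if unset."""
    if fd is None:
        return True
    seqno, metric = update
    fd_seqno, fd_metric = fd
    if seqno_less(fd_seqno, seqno):
        return True  # a newer seqno is feasible whatever its metric
    return seqno == fd_seqno and metric < fd_metric


fd = (5, 10)          # feasibility distance held by the router
only_route = (5, 20)  # the only route left after a link failure

# Same seqno, larger metric: the route is rejected and the router starves.
starved = not is_feasible(only_route, fd)

# A seqno request reaches the source, which increments its seqno and
# re-announces the route; the same path is now feasible.
refreshed = (6, 20)
recovered = is_feasible(refreshed, fd)

print(starved, recovered)
```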

2.  Properties of the Babel Protocol

   This section describes the properties of the Babel protocol as well
   as its known limitations.

2.1.  Simplicity and Implementability

   Babel is a conceptually simple protocol.  It consists of a familiar
   algorithm (distributed Bellman-Ford) augmented with three simple and
   well-defined mechanisms (feasibility, sequenced routes, and explicit
   requests).  Given a sufficiently friendly audience, the principles
   behind Babel can be explained in 15 minutes, and a full description
   of the protocol can be done in 52 minutes (one microcentury).

   An important consequence is that Babel is easy to implement.  At the
   time of writing, there exist four independent, interoperable
   implementations, including one that was reportedly written and
   debugged in just two nights.

2.2.  Robustness

   The fairly strong properties of the Babel protocol (convergence, loop
   avoidance, and starvation avoidance) rely on some reasonably weak
   properties of the network and the metric being used.  The most
   significant are:

      causality:  the "happens-before" relation is acyclic (intuitively,
         a control message is not received before it has been sent);

      strict monotonicity of the metric:  for any metric M and link
         cost C, M < C + M (intuitively, this implies that cycles have a
         strictly positive metric);

      left-distributivity of the metric:  for any metrics M and M' and
         cost C, if M <= M', then C + M <= C + M' (intuitively, this
         implies that a good choice made by a neighbour B of a node A is
         also a good choice for A).

   See [METAROUTING] for more information about these properties and
   their consequences.
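
   Whether a candidate metric satisfies the two metric properties can
   be checked by brute force over sample values, as in the Python
   sketch below.  Such a check can only find counterexamples; it is
   not a proof.  The helper names and the sample values are
   hypothetical, and combine(c, m) stands for the operation written
   C + M above.

```python
from itertools import product


def is_strictly_monotonic(combine, costs, metrics):
    """Check M < C + M for every sampled cost C and metric M."""
    return all(m < combine(c, m) for c, m in product(costs, metrics))


def is_left_distributive(combine, costs, metrics):
    """Check that M <= M' implies C + M <= C + M' on the samples."""
    return all(
        combine(c, m1) <= combine(c, m2)
        for c in costs
        for m1, m2 in product(metrics, repeat=2)
        if m1 <= m2
    )


def additive(c, m):
    return c + m  # ordinary sum of link costs


def bottleneck(c, m):
    return max(c, m)  # cost of the worst link on the path


costs = [0.5, 1, 96, 256]  # continuous costs are allowed
metrics = [0, 1, 50, 1000]

for combine in (additive, bottleneck):
    print(
        combine.__name__,
        is_strictly_monotonic(combine, costs, metrics),
        is_left_distributive(combine, costs, metrics),
    )
```

   On these samples, the additive metric passes both checks, while the
   bottleneck metric is left-distributive but not strictly monotonic
   (a cycle need not increase it).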

   In particular, Babel does not assume a reliable transport, it does
   not assume ordered delivery, it does not assume that communication is
   transitive, and it does not require that the metric be discrete
   (continuous metrics are possible, for example, reflecting packet loss
   rates).  This is in contrast to link-state routing protocols such as
   OSPF [RFC5340] or IS-IS [RFC1195], which incorporate a reliable
   flooding algorithm and make stronger requirements on the underlying
   network and metric.

   These weak requirements make Babel a robust protocol:

      robust with respect to unusual networks:  an unusual network (non-
         transitive links, unstable link costs, etc.) is unlikely to
         violate the assumptions of the protocol;

      robust with respect to novel metrics:  an unusual metric
         (continuous, constantly fluctuating, etc.) is unlikely to
         violate the assumptions of the protocol.

   Section 3 gives examples of successful deployments of Babel that
   illustrate these properties.

   These robustness properties have important consequences for the
   applicability of the protocol: Babel works (more or less efficiently)
   in a range of circumstances where traditional routing protocols don't
   work well (or at all).

2.3.  Extensibility

   Babel's packet format has a number of features that make the protocol
   extensible (see Appendix D of [RFC8966]), and a number of extensions
   have been designed to make Babel work better in situations that were
   not envisioned when the protocol was initially designed.  The ease of
   extensibility is not an accident, but a consequence of the design of
   the protocol: it is reasonably easy to check whether a given
   extension violates the assumptions on which Babel relies.

   All of the extensions designed to date interoperate with the base
   protocol and with each other.  This, again, is a consequence of the
   protocol design: in order to check that two extensions to the Babel
   protocol are interoperable, it is enough to verify that the
   interaction of the two does not violate the base protocol's
   assumptions.

   Notable extensions deployed to date include:

   *  source-specific routing (also known as Source-Address Dependent
      Routing, SADR) [BABEL-SS] allows forwarding to take a packet's
      source address into account, thus enabling a cheap form of
      multihoming [SS-ROUTING];

   *  RTT-based routing [BABEL-RTT] minimises link delay, which is
      useful in overlay networks (where both hop count and packet loss
      are poor metrics).

   Some other extensions have been designed but have not seen deployment
   in production (and their usefulness is yet to be demonstrated):

   *  frequency-aware routing [BABEL-Z] aims to minimise radio
      interference in wireless networks;

   *  ToS-aware routing [BABEL-TOS] allows routing to take a packet's
      Type of Service (ToS) marking into account for selected routes
      without incurring the full cost of a multi-topology routing
      protocol.

2.4.  Limitations

   Babel has some undesirable properties that make it suboptimal or even
   unusable in some deployments.

2.4.1.  Periodic Updates

   The main mechanisms used by Babel to reconverge after a topology
   change are reactive: triggered updates, triggered retractions, and
   explicit requests.  However, Babel relies on periodic updates to
   clear pathologies after a mobility event or in the presence of heavy
   packet loss.  The use of periodic updates makes Babel unsuitable in
   at least two kinds of environments:

      large, stable networks:  since Babel sends periodic updates even
         in the absence of topology changes, in well-managed, large,
         stable networks the amount of control traffic will be reduced
         by using a protocol that uses a reliable transport (such as
         OSPF, IS-IS, or EIGRP);

      low-power networks:  the periodic updates use up battery power
         even when there are no topology changes and no user traffic,
         which makes Babel wasteful in low-power networks.

2.4.2.  Full Routing Table

   While there exist techniques that allow a Babel speaker to function
   with a partial routing table (e.g., by learning just a default route
   or, more generally, performing route aggregation), Babel is designed
   around the assumption that every router has a full routing table.  In
   networks where some nodes are too constrained to hold a full routing
   table, it might be preferable to use a protocol that was designed
   from the outset to work with a partial routing table (such as the Ad
   hoc On-Demand Distance Vector (AODV) routing protocol [RFC3561], the
   IPv6 Routing Protocol for Low-Power and Lossy Networks (RPL)
   [RFC6550], or the Lightweight On-demand Ad hoc Distance-vector
   Routing Protocol - Next Generation (LOADng) [LOADng]).

2.4.3.  Slow Aggregation

   Babel's loop-avoidance mechanism relies on making a route unreachable
   after a retraction until all neighbours have been guaranteed to have
   acted upon the retraction, even in the presence of packet loss.
   Unless the second algorithm described in Section 3.5.5 of [RFC8966]
   is implemented, this entails that a node is unreachable for a few
   minutes after the most specific route to it has been retracted.  This
   delay makes Babel slow to recover from a topology change in networks
   that perform automatic route aggregation.

3.  Successful Deployments of Babel

   This section gives a few examples of environments where Babel has
   been successfully deployed.

3.1.  Heterogeneous Networks

   Babel is able to deal with both classical, prefix-based ("Internet-
   style") routing and flat ("mesh-style") routing over non-transitive
   link technologies.  Just like traditional distance-vector protocols,
   Babel is able to carry prefixes of arbitrary length, to suppress
   redundant announcements by applying the split-horizon optimisation
   where applicable, and can be configured to filter out redundant
   announcements (manual aggregation).  Just like specialised mesh
   protocols, Babel doesn't by default assume that links are transitive
   or symmetric, can dynamically compute metrics based on an estimation
   of link quality, and carries large numbers of host routes efficiently
   by omitting common prefixes.

   Because of these properties, Babel has seen a number of successful
   deployments in medium-sized heterogeneous networks, networks that
   combine a wired, aggregated backbone with meshy wireless bits at the
   edges.

   Efficient operation in heterogeneous networks requires the
   implementation to distinguish between wired and wireless links, and
   to perform link quality estimation on wireless links.
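
   One possible way to treat the two kinds of links differently is
   sketched below in Python: a "k-out-of-j" test on wired links and an
   ETX-style (expected transmission count) cost on wireless links.
   The function names, the nominal costs (96 and 256), and the
   parameters k=2, j=3 are assumptions made for this illustration;
   link cost computation is left to the implementation.

```python
INFINITE_COST = 0xFFFF


def delivery_ratio(history):
    """Fraction of expected Hellos that arrived (history: list of bools)."""
    return sum(history) / len(history) if history else 0.0


def wired_cost(history, k=2, j=3, nominal_cost=96):
    """Link is usable if at least k of the last j Hellos arrived."""
    recent = history[-j:]
    return nominal_cost if sum(recent) >= k else INFINITE_COST


def wireless_cost(forward_ratio, reverse_ratio, nominal_cost=256):
    """Nominal cost scaled by the expected number of transmissions."""
    p = forward_ratio * reverse_ratio  # chance a frame and its ACK get through
    if p <= 0:
        return INFINITE_COST
    return min(INFINITE_COST, round(nominal_cost / p))


hellos = [True, True, False, True, True, True, False, True, True, True]
print(wired_cost(hellos))
print(wireless_cost(delivery_ratio(hellos), 0.5))
```

   The wired estimator reacts quickly but only distinguishes "up" from
   "down"; the wireless estimator degrades gradually as loss
   increases, which lets the protocol prefer a longer path made of
   good links over a short path made of lossy ones.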

3.2.  Large-Scale Overlay Networks

   The algorithms used by Babel (loop avoidance, hysteresis, delayed
   updates) allow it to remain stable in the presence of unstable
   metrics, even in the presence of a feedback loop.  For this reason,
   it has been successfully deployed in large-scale overlay networks,
   built out of thousands of tunnels spanning continents, where it is
   used with a metric computed from links' latencies.

   This particular application depends on the extension for RTT-
   sensitive routing [DELAY-BASED].
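
   The stabilising effect of smoothing and hysteresis on a latency-
   based metric can be illustrated with the Python sketch below.  It
   is not the algorithm used by the RTT extension: the smoothing
   factor, the 10% switching margin, and all names are hypothetical
   choices made for this example.

```python
class SmoothedRtt:
    """Exponentially weighted moving average of RTT samples (ms)."""

    def __init__(self, alpha=0.125):
        self.alpha = alpha  # weight given to each new sample
        self.value = None   # no estimate until the first sample

    def update(self, sample_ms):
        if self.value is None:
            self.value = float(sample_ms)
        else:
            self.value += self.alpha * (sample_ms - self.value)
        return self.value


def should_switch(current_metric, candidate_metric, margin=0.1):
    """Switch only if the candidate is better by more than `margin`."""
    return candidate_metric < current_metric * (1.0 - margin)


tunnel_a, tunnel_b = SmoothedRtt(), SmoothedRtt()
# Tunnel A is currently selected and suffers one latency spike (90 ms).
for a_ms, b_ms in [(40, 44), (42, 44), (41, 44), (90, 44), (40, 44)]:
    a, b = tunnel_a.update(a_ms), tunnel_b.update(b_ms)
    print(f"A={a:.1f} ms  B={b:.1f} ms  switch to B: {should_switch(a, b)}")
```

   In this run, the transient spike raises the smoothed estimate for
   tunnel A above that of tunnel B, but not by enough to exceed the
   margin, so the selected route does not flap.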

3.3.  Pure Mesh Networks

   While Babel is a general-purpose routing protocol, it has been shown
   to be competitive with dedicated routing protocols for wireless mesh
   networks [REAL-WORLD] [BRIDGING-LAYERS].  Although this particular
   niche is already served by a number of mature protocols, notably the
   Optimized Link State Routing Protocol with Expected Transmission
   Count (OLSR-ETX) and OLSRv2 (OLSR Version 2) [RFC7181] (equipped
   e.g., with the Directional Airtime (DAT) metric [RFC7779]), Babel has
   seen a moderate amount of successful deployment in pure mesh
   networks.

3.4.  Small Unmanaged Networks

   Because of its small size and simple configuration, Babel has been
   deployed in small, unmanaged networks (e.g., home and small office
   networks), where it serves as a more efficient replacement for RIP
   [RFC2453], over which it has two significant advantages: the ability
   to route multiple address families (IPv6 and IPv4) in a single
   protocol instance and good support for using wireless links for
   transit.

4.  Security Considerations

   As is the case in all distance-vector routing protocols, a Babel
   speaker receives reachability information from its neighbours, which
   by default is trusted by all nodes in the routing domain.

   At the time of writing, the Babel protocol is usually run over a
   network that is secured either at the physical layer (e.g.,
   physically protecting Ethernet sockets) or at the link layer (using a
   protocol such as Wi-Fi Protected Access 2 (WPA2)).  If Babel is being
   run over an unprotected network, then the routing traffic needs to be
   protected using a sufficiently strong cryptographic mechanism.

   At the time of writing, two such mechanisms have been defined.
   Message Authentication Code (MAC) authentication for Babel (Babel-
   MAC) [RFC8967] is a simple, easy-to-implement mechanism that only
   guarantees authenticity, integrity, and replay protection of the
   routing traffic and only supports symmetric keying with a small
   number of keys (typically just one or two).  Babel-DTLS [RFC8968] is
   a more complex mechanism that requires some minor changes to be made
   to a typical Babel implementation and depends on a DTLS stack being
   available, but inherits all of the features of DTLS, notably
   confidentiality, optional replay protection, and the ability to use
   asymmetric keys.

   Due to its simplicity, Babel-MAC should be the preferred security
   mechanism in most deployments, with Babel-DTLS available for networks
   that require its additional features.
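
   The general idea behind MAC-based protection with a small set of
   symmetric keys can be sketched in Python as follows.  This is a
   simplified illustration and not an implementation of Babel-MAC: the
   pseudo-header layout and the replay protection machinery defined in
   [RFC8967] are omitted, and the function names are hypothetical.

```python
import hashlib
import hmac


def compute_mac(key, pseudo_header, packet):
    """HMAC-SHA256 over an opaque pseudo-header followed by the packet."""
    return hmac.new(key, pseudo_header + packet, hashlib.sha256).digest()


def verify_packet(keys, pseudo_header, packet, received_mac):
    """Accept the packet if the MAC matches under any configured key."""
    return any(
        # compare_digest avoids leaking the mismatch position via timing
        hmac.compare_digest(compute_mac(k, pseudo_header, packet), received_mac)
        for k in keys
    )


keys = [b"old-shared-secret", b"new-shared-secret"]  # e.g. during rotation
header = b"example-pseudo-header"                    # placeholder bytes
packet = b"example-babel-packet"                     # placeholder bytes

mac = compute_mac(keys[1], header, packet)
print(verify_packet(keys, header, packet, mac))
print(verify_packet(keys, header, packet + b"tampered", mac))
```

   Accepting a MAC computed under any of the configured keys is what
   makes it practical to deploy a new key before retiring the old one.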

   In addition to the above, the information that a mobile Babel node
   announces to the whole routing domain is often sufficient to
   determine a mobile node's physical location with reasonable
   precision.  This might make Babel inapplicable in scenarios where a
   node's location is considered confidential.